pygor3 | pygor3 repository

by statbiophys Jupyter Notebook Version: v0.3.2 License: Non-SPDX


kandi X-RAY | pygor3 Summary

pygor3 is a Jupyter Notebook library with low support. It has no reported bugs and no reported vulnerabilities, but it has a Non-SPDX license. You can download it from GitHub.

Support

pygor3 has a low active ecosystem.

It has 8 stars and 3 forks. There are 5 watchers for this library.

It had no major release in the last 6 months.

There are 4 open issues and none have been closed. There is 1 open pull request and there are no closed ones.

It has a neutral sentiment in the developer community.

The latest version of pygor3 is v0.3.2.

Quality

pygor3 has no bugs reported.

Security

pygor3 has no vulnerabilities reported, and its dependent libraries have no vulnerabilities reported.

License

pygor3 has a Non-SPDX License.

A Non-SPDX license may be an open source license that is not SPDX-compliant, or a non-open-source license. Review it closely before use.

Reuse

pygor3 releases are not available. You will need to build from source code and install.

Installation instructions, examples and code snippets are available.



Install pygor3

Pygor3 uses IGoR as a background program to infer and evaluate models. First install IGoR if it is not already installed on your system.
IGoR can be installed from the GitHub repository [IGoR](https://github.com/statbiophys/IGoR), from pre-compiled images, or by using a Docker image.

(Optional, but recommended) Install [conda](https://docs.conda.io/en/latest/) or [anaconda](https://www.anaconda.com/) and create (or use) a virtual environment.

(Optional, but recommended) To use the demo notebooks we recommend jupyter-lab.

Pygor can be installed from the [PyPI](https://pypi.org/) repository using the package manager [pip](https://pip.pypa.io/en/stable/).

For the most recent version of pygor (from GitHub):

    (statbiophys) $ git clone https://github.com/statbiophys/pygor3.git
    (statbiophys) $ cd pygor3
    (statbiophys) $ pip install -e .

If installation from source was successful, pygor3 should automatically recognize the IGoR binary executable ("paths.igor_exec") and the default data directory ("paths.igor_data"). If not, you can edit the configuration file to set the paths where pygor3 looks for IGoR. This file is usually located at ${HOME}/.local/share/pygor3/config.json. The variable "paths.igor_exec" is the path of the executable, and "paths.igor_data" is the root directory where pygor3 looks for IGoR's default models.
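As a quick sanity check of the configuration described above, the following is a minimal sketch (not part of pygor3) that reads the JSON file and reports whether the two configured paths exist. It assumes the file is plain JSON and that the keys are stored either flat (`"paths.igor_exec"`) or nested under `"paths"`; the real layout is not documented here, so adjust as needed. The names `lookup` and `check_igor_paths` are hypothetical helpers.

```python
import json
import os
from pathlib import Path

# Default location mentioned above; it may differ on your system.
DEFAULT_CONFIG = Path.home() / ".local" / "share" / "pygor3" / "config.json"


def lookup(config, dotted_key):
    """Return the value for dotted_key, accepting a flat or nested layout (assumption)."""
    if dotted_key in config:
        return config[dotted_key]
    node = config
    for part in dotted_key.split("."):
        if not isinstance(node, dict) or part not in node:
            return None
        node = node[part]
    return node


def check_igor_paths(config_path=DEFAULT_CONFIG):
    """Map each IGoR path key to (configured value, whether that path exists)."""
    with open(config_path, encoding="utf-8") as handle:
        config = json.load(handle)
    report = {}
    for key in ("paths.igor_exec", "paths.igor_data"):
        value = lookup(config, key)
        exists = isinstance(value, str) and os.path.exists(value)
        report[key] = (value, exists)
    return report


if __name__ == "__main__":
    if DEFAULT_CONFIG.exists():
        for key, (value, found) in check_igor_paths().items():
            print(f"{key}: {value!r} ({'found' if found else 'missing'})")
    else:
        print(f"No configuration file at {DEFAULT_CONFIG}")
```

If either path is reported as missing, correcting it in the configuration file is likely simpler than reinstalling.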
Download a copy of the demo sequences into the current directory. The download command creates a directory demo, with sequences to infer and evaluate a new model.

To use the demo notebooks, change directory to notebooks and run jupyter-lab. For the command line tools, change to the directory data/IgL.

To create a model from scratch, download gene templates and anchors from the [IMGT](http://www.imgt.org/) website. A list of species available for download from IMGT can be queried with the imgt-get-genomes command and the --info option.

A model can be plotted from a database file, from a model directory, or by passing the model_parms.txt and model_marginals.txt files. This outputs two PDF files, with the marginal probabilities and the conditional probabilities of events.

The .db files can contain all the information in IGoR's standard files in a single SQLite database file, and can be examined with any SQLite client, such as sqlite3 or sqlitebrowser. Pygor has its own commands to manipulate the data in a database file. For instance, db-ls lists the contents of the database and the number of records. Similarly, the commands db-rm, db-cp, db-import and db-export can be used to manipulate database files.

Once we have an inferred model, we can evaluate the probability that a particular sequence is generated (pgen), get the most probable recombination scenarios for the input sequences, or generate synthetic sequences. Sequences can be evaluated using the files model_parms.txt, model_marginals.txt, genomicXs.fasta and X_gene_CDR3_anchors.csv, or using a single database file that contains all this information. For instance, the "new_IgL_naive_mdl.db" file contains only the model and genome information, which is what IGoR needs for alignment and evaluation. A TSV file in the AIRR standard format is created with the rearrangements.
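Because the .db files are ordinary SQLite databases, they can also be inspected from Python's standard library. The sketch below lists every table with its row count, roughly the kind of overview the text attributes to db-ls. It is not pygor's implementation, it makes no assumption about which tables pygor creates, and `list_tables` is a hypothetical helper name.

```python
import sqlite3
import sys


def list_tables(db_path):
    """Return {table_name: row_count} for every table in an SQLite file."""
    # Note: sqlite3.connect creates an empty file if db_path does not exist.
    connection = sqlite3.connect(db_path)
    try:
        names = [
            row[0]
            for row in connection.execute(
                "SELECT name FROM sqlite_master WHERE type = 'table' ORDER BY name"
            )
        ]
        counts = {}
        for name in names:
            # Quote the identifier, since table names come from the file itself.
            quoted = '"' + name.replace('"', '""') + '"'
            counts[name] = connection.execute(
                f"SELECT COUNT(*) FROM {quoted}"
            ).fetchone()[0]
        return counts
    finally:
        connection.close()


if __name__ == "__main__" and len(sys.argv) > 1:
    # Usage: python list_tables.py new_IgL_naive_mdl.db
    for table, count in list_tables(sys.argv[1]).items():
        print(f"{table}: {count} rows")
```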
Download genomic templates using VJ or VDJ, corresponding to the type of chain.

```console
(statbiophys) $ pygor imgt-get-genomes --imgt-species Homo+sapiens --imgt-chain IGL -t VJ
--------------------------------
http://www.imgt.org
get_ref_genome
Homo+sapiens IGLV
http://www.imgt.org/genedb/GENElect?query=7.2+IGLV&species=Homo+sapiens
http://www.imgt.org/genedb/GENElect?query=7.2+IGLV&species=Homo+sapiens
Homo+sapiens IGLJ
http://www.imgt.org/genedb/GENElect?query=7.2+IGLJ&species=Homo+sapiens
http://www.imgt.org/genedb/GENElect?query=7.2+IGLJ&species=Homo+sapiens
http://www.imgt.org/genedb/GENElect?query=8.1+IGLV&species=Homo+sapiens&IMGTlabel=2nd-CYS
No anchor is found for : AC279423|IGLV(I)-11-1*01|Homo sapiens|P|V-REGION|22452..22620|169 nt|1| | | | |169+0=169|partial in 5'| |
No anchor is found for : D87007|IGLV(I)-20*01|Homo sapiens|P|V-REGION|15573..15858|286 nt|1| | | | |286+0=286| | |
No anchor is found for : AC279208|IGLV(I)-20*02|Homo sapiens|P|V-REGION|19943..20228|286 nt|1| | | | |286+0=286| | |
...
Number of features: 0
Seq('TGCTGTGTTCGGAGGAGGCACCCAGCTGACCGTCCTCG')
ID: D87017|IGLJ7*02|Homo
Name: D87017|IGLJ7*02|Homo
Description: D87017|IGLJ7*02|Homo sapiens|F|J-REGION|18513..18550|38 nt|2| | | | |38+0=38| | |
Number of features: 0
Seq('TGCTGTGTTCGGAGGAGGCACCCAGCTGACCGCCCTCG')
----------------------
Genomic VJ templates in files:
models/Homo+sapiens/IGL/ref_genome/genomicVs__imgt.fasta
models/Homo+sapiens/IGL/ref_genome/genomicJs__imgt.fasta
```

This command creates a directory **models** with the following structure (the listing shows a `TRB` download, which includes D gene files; for the `IGL` command above the chain directory is `IGL`):

```
models/
└── Homo+sapiens
    └── TRB
        ├── models
        └── ref_genome
            ├── genomicDs.fasta
            ├── genomicDs__imgt.fasta
            ├── genomicDs__imgt.fasta_short
            ├── genomicJs.fasta
            ├── genomicJs__imgt.fasta
            ├── genomicJs__imgt.fasta_short
            ├── genomicJs__imgt.fasta_trim
            ├── genomicVs.fasta
            ├── genomicVs__imgt.fasta
            ├── genomicVs__imgt.fasta_short
            ├── genomicVs__imgt.fasta_trim
            ├── J_gene_CDR3_anchors.csv
            ├── J_gene_CDR3_anchors__imgt.csv
            ├── J_gene_CDR3_anchors__imgt.csv_short
            ├── V_gene_CDR3_anchors.csv
            ├── V_gene_CDR3_anchors__imgt.csv
            └── V_gene_CDR3_anchors__imgt.csv_short
```

---

**Important Note**

Carefully review your downloaded gene templates. Pygor automatically renames the long IMGT descriptions to short ones. For instance

    D86996|IGLV(I)-56*01|Homo sapiens|P|V-REGION|12276..12571|296 nt|1| | | | |296+0=296| | |
    D86996|IGLV(I)-56*01|Homo sapiens|P|V-REGION|12576..12876|301 nt|1| | | | |301+0=301| | |

are renamed as:

    IGLV(I)-56*01
    IGLV(I)-56*01

In these cases, it is important to rename or remove the duplicate manually before creating a new model. For simplicity, in this demo we remove the second IGLV(I)-56*01.

---
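Duplicate short names like the one in the note above can be spotted before creating a model. The following is a minimal sketch, assuming the short name is the second `|`-separated field of the IMGT header, as the example in the note suggests; pygor's actual renaming rule may differ. `short_name` and `duplicated_short_names` are hypothetical helpers, not pygor functions.

```python
from collections import Counter


def short_name(header):
    """Take the second '|'-separated field of an IMGT FASTA header (assumption)."""
    fields = header.lstrip(">").split("|")
    return fields[1].strip() if len(fields) > 1 else fields[0].strip()


def duplicated_short_names(fasta_lines):
    """Return the short names that occur more than once, in first-seen order."""
    counts = Counter(
        short_name(line) for line in fasta_lines if line.startswith(">")
    )
    return [name for name, count in counts.items() if count > 1]


# Headers copied from the output and the note above.
headers = [
    ">D86996|IGLV(I)-56*01|Homo sapiens|P|V-REGION|12276..12571|296 nt|1| | | | |296+0=296| | |",
    ">D86996|IGLV(I)-56*01|Homo sapiens|P|V-REGION|12576..12876|301 nt|1| | | | |301+0=301| | |",
    ">D87017|IGLJ7*02|Homo sapiens|F|J-REGION|18513..18550|38 nt|2| | | | |38+0=38| | |",
]
print(duplicated_short_names(headers))  # ['IGLV(I)-56*01']
```

To check a downloaded file, pass an open file handle (for example `open("genomicVs__imgt.fasta")`) instead of the `headers` list; sequence lines are ignored because they do not start with `>`.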
Create a new initial default model, with a uniform distribution for the conditional probabilities of the Bayes network ("model_marginals.txt" file). Notice that in IGoR this file is called marginals, but it does not contain the marginal probabilities of the recombination events.

```console
(statbiophys) $ pygor model-create -M models/Homo+sapiens/IGL/ -t VJ
--------------------------------
No D genes were found.
[Errno 2] No such file or directory: 'models/Homo+sapiens/IGL//ref_genome//genomicDs.fasta'
No D genes were found.
[Errno 2] No such file or directory: 'models/Homo+sapiens/IGL//ref_genome//genomicDs.fasta'
igortask.igor_model_dir_path: models/Homo+sapiens/IGL/
Writing model parms in file models/Homo+sapiens/IGL//models/model_parms.txt
Writing model marginals in file models/Homo+sapiens/IGL//models/model_marginals.txt
```

The initial model, with uniform parameters, is written to the files **model_parms.txt** and **model_marginals.txt** in the model directory:

```console
models/
└── Homo+sapiens
    └── IGL
        ├── models
        │   ├── model_marginals.txt
        │   └── model_parms.txt
        └── ref_genome
            ├── genomicJs.fasta
            ├── genomicJs__imgt.fasta
            ├── genomicJs__imgt.fasta_short
            ├── genomicJs__imgt.fasta_trim
            ├── genomicVs.fasta
            ├── genomicVs__imgt.fasta
            ├── genomicVs__imgt.fasta_short
            ├── genomicVs__imgt.fasta_trim
            ├── J_gene_CDR3_anchors.csv
            ├── J_gene_CDR3_anchors__imgt.csv
            ├── J_gene_CDR3_anchors__imgt.csv_short
            ├── V_gene_CDR3_anchors.csv
            ├── V_gene_CDR3_anchors__imgt.csv
            └── V_gene_CDR3_anchors__imgt.csv_short
```

At this point you can use a set of non-productive sequences to infer a model, either within IGoR directly or by using the pygor command (the simpler option).

```console
(statbiophys) $ pygor igor-infer -M models/Homo+sapiens/IGL/ -i data/IgL/IgL_seqs_naive_Nofunctional.txt -o new_IgL_naive
--------------------------------
===== Running inference =====
...
WARNING: write_model_parms path [Errno 2] No such file or directory: ''
Writing model parms in file new_IgL_naive_parms.txt
WARNING: IgorModel_Marginals.write_model_marginals path [Errno 2] No such file or directory: ''
Writing model marginals in file new_IgL_naive_marginals.txt
Database file : new_IgL_naive
```

This will output the following files

```console
new_IgL_naive.db
new_IgL_naive_BN.pdf
new_IgL_naive_PM.pdf
new_IgL_naive_marginals.txt
new_IgL_naive_parms.txt
```

where new_IgL_naive.db is a database with the encapsulated information about the new model and the data used by IGoR to infer it, new_IgL_naive_BN.pdf is a plot of the Bayesian network (BN) of the inferred model, new_IgL_naive_PM.pdf contains plots of the real marginals of the events in the BN, and new_IgL_naive_parms.txt and new_IgL_naive_marginals.txt are the inferred model in IGoR's format.
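After an inference run it can be useful to confirm that all five files were written. The sketch below derives the expected names from the output prefix, using only the suffixes in the listing above. It is an illustration rather than a pygor feature, and `expected_outputs` and `missing_outputs` are hypothetical helper names.

```python
from pathlib import Path

# Suffixes taken from the file listing above.
OUTPUT_SUFFIXES = (".db", "_BN.pdf", "_PM.pdf", "_marginals.txt", "_parms.txt")


def expected_outputs(prefix):
    """Return the file names expected for a given output prefix."""
    return [f"{prefix}{suffix}" for suffix in OUTPUT_SUFFIXES]


def missing_outputs(prefix, directory="."):
    """Return the expected files that are not present in the directory."""
    base = Path(directory)
    return [name for name in expected_outputs(prefix) if not (base / name).is_file()]


if __name__ == "__main__":
    missing = missing_outputs("new_IgL_naive")
    print("All output files found" if not missing else f"Missing: {missing}")
```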

Support

All the command line interface commands can also be used in a Python environment, such as a Jupyter notebook, by importing the pygor3 package. For further details, check out the [documentation](https://pygor3.readthedocs.io/en/latest/) and the notebooks directory.
